Goto

Collaborating Authors

 active estimation


ActiveCQ: Active Estimation of Causal Quantities

arXiv.org Machine Learning

Estimating causal quantities (CQs) typically requires large datasets, which can be expensive to obtain, especially when measuring individual outcomes is costly. This challenge highlights the importance of sample-efficient active learning strategies. To address the narrow focus of prior work on the conditional average treatment effect, we formalize the broader task of Actively estimating Causal Quantities (ActiveCQ) and propose a unified framework for this general problem. Built upon the insight that many CQs are integrals of regression functions, our framework models the regression function with a Gaussian Process. For the distribution component, we explore both a baseline using explicit density estimators and a more integrated method using conditional mean embeddings in a reproducing kernel Hilbert space. This latter approach offers key advantages: it bypasses explicit density estimation, operates within the same function space as the GP, and adaptively refines the distributional model after each update. Our framework enables the principled derivation of acquisition strategies from the CQ's posterior uncertainty; we instantiate this principle with two utility functions based on information gain and total variance reduction. A range of simulated and semi-synthetic experiments demonstrate that our principled framework significantly outperforms relevant baselines, achieving substantial gains in sample efficiency across a variety of CQs.


Active Estimation of F-Measures

Neural Information Processing Systems

We address the problem of estimating the F-measure of a given model as accurately as possible on a fixed labeling budget. This problem occurs whenever an estimate cannot be obtained from held-out training data; for instance, when data that have been used to train the model are held back for reasons of privacy or do not reflect the test distribution. In this case, new test instances have to be drawn and labeled at a cost. An active estimation procedure selects instances according to an instrumental sampling distribution. An analysis of the sources of estimation error leads to an optimal sampling distribution that minimizes estimator variance.


Active Estimation of F-Measures

Neural Information Processing Systems

We address the problem of estimating the F-measure of a given model as accurately as possible on a fixed labeling budget. This problem occurs whenever an estimate cannot be obtained from held-out training data; for instance, when data that have been used to train the model are held back for reasons of privacy or do not reflect the test distribution. In this case, new test instances have to be drawn and labeled at a cost. An active estimation procedure selects instances according to an instrumental sampling distribution. An analysis of the sources of estimation error leads to an optimal sampling distribution that minimizes estimator variance.


Active Estimation of F-Measures

Neural Information Processing Systems

We address the problem of estimating the F-measure of a given model as accurately as possible on a fixed labeling budget. This problem occurs whenever an estimate cannot be obtained from held-out training data; for instance, when data that have been used to train the model are held back for reasons of privacy or do not reflect the test distribution. In this case, new test instances have to be drawn and labeled at a cost. An active estimation procedure selects instances according to an instrumental sampling distribution. An analysis of the sources of estimation error leads to an optimal sampling distribution that minimizes estimator variance. We explore conditions under which active estimates of F-measures are more accurate than estimates based on instances sampled from the test distribution.